262 research outputs found

    Structure and Complexity in Planning with Unary Operators

    Full text link
    Unary operator domains -- i.e., domains in which operators have a single effect -- arise naturally in many control problems. In its most general form, the problem of STRIPS planning in unary operator domains is known to be as hard as the general STRIPS planning problem -- both are PSPACE-complete. However, unary operator domains induce a natural structure, called the domain's causal graph. This graph relates between the preconditions and effect of each domain operator. Causal graphs were exploited by Williams and Nayak in order to analyze plan generation for one of the controllers in NASA's Deep-Space One spacecraft. There, they utilized the fact that when this graph is acyclic, a serialization ordering over any subgoal can be obtained quickly. In this paper we conduct a comprehensive study of the relationship between the structure of a domain's causal graph and the complexity of planning in this domain. On the positive side, we show that a non-trivial polynomial time plan generation algorithm exists for domains whose causal graph induces a polytree with a constant bound on its node indegree. On the negative side, we show that even plan existence is hard when the graph is a directed-path singly connected DAG. More generally, we show that the number of paths in the causal graph is closely related to the complexity of planning in the associated domain. Finally we relate our results to the question of complexity of planning with serializable subgoals

    Learning to Coordinate Efficiently: A Model-based Approach

    Full text link
    In common-interest stochastic games all players receive an identical payoff. Players participating in such games must learn to coordinate with each other in order to receive the highest-possible value. A number of reinforcement learning algorithms have been proposed for this problem, and some have been shown to converge to good solutions in the limit. In this paper we show that using very simple model-based algorithms, much better (i.e., polynomial) convergence rates can be attained. Moreover, our model-based algorithms are guaranteed to converge to the optimal value, unlike many of the existing algorithms

    On Partially Controlled Multi-Agent Systems

    Full text link
    Motivated by the control theoretic distinction between controllable and uncontrollable events, we distinguish between two types of agents within a multi-agent system: controllable agents, which are directly controlled by the system's designer, and uncontrollable agents, which are not under the designer's direct control. We refer to such systems as partially controlled multi-agent systems, and we investigate how one might influence the behavior of the uncontrolled agents through appropriate design of the controlled agents. In particular, we wish to understand which problems are naturally described in these terms, what methods can be applied to influence the uncontrollable agents, the effectiveness of such methods, and whether similar methods work across different domains. Using a game-theoretic framework, this paper studies the design of partially controlled multi-agent systems in two contexts: in one context, the uncontrollable agents are expected utility maximizers, while in the other they are reinforcement learners. We suggest different techniques for controlling agents' behavior in each domain, assess their success, and examine their relationship.Comment: See http://www.jair.org/ for any accompanying file

    CP-nets: A Tool for Representing and Reasoning withConditional Ceteris Paribus Preference Statements

    Full text link
    Information about user preferences plays a key role in automated decision making. In many domains it is desirable to assess such preferences in a qualitative rather than quantitative way. In this paper, we propose a qualitative graphical representation of preferences that reflects conditional dependence and independence of preference statements under a ceteris paribus (all else being equal) interpretation. Such a representation is often compact and arguably quite natural in many circumstances. We provide a formal semantics for this model, and describe how the structure of the network can be exploited in several inference tasks, such as determining whether one outcome dominates (is preferred to) another, ordering a set outcomes according to the preference relation, and constructing the best outcome subject to available evidence

    Partial-Order Planning with Concurrent Interacting Actions

    Full text link
    In order to generate plans for agents with multiple actuators, agent teams, or distributed controllers, we must be able to represent and plan using concurrent actions with interacting effects. This has historically been considered a challenging task requiring a temporal planner with the ability to reason explicitly about time. We show that with simple modifications, the STRIPS action representation language can be used to represent interacting actions. Moreover, algorithms for partial-order planning require only small modifications in order to be applied in such multiagent domains. We demonstrate this fact by developing a sound and complete partial-order planner for planning with concurrent interacting actions, POMP, that extends existing partial-order planners in a straightforward way. These results open the way to the use of partial-order planners for the centralized control of cooperative multiagent systems

    On Graphical Modeling of Preference and Importance

    Full text link
    In recent years, CP-nets have emerged as a useful tool for supporting preference elicitation, reasoning, and representation. CP-nets capture and support reasoning with qualitative conditional preference statements, statements that are relatively natural for users to express. In this paper, we extend the CP-nets formalism to handle another class of very natural qualitative statements one often uses in expressing preferences in daily life - statements of relative importance of attributes. The resulting formalism, TCP-nets, maintains the spirit of CP-nets, in that it remains focused on using only simple and natural preference statements, uses the ceteris paribus semantics, and utilizes a graphical representation of this information to reason about its consistency and to perform, possibly constrained, optimization using it. The extra expressiveness it provides allows us to better model tradeoffs users would like to make, more faithfully representing their preferences

    Attentive Neural Architecture Incorporating Song Features For Music Recommendation

    Full text link
    Recommender Systems are an integral part of music sharing platforms. Often the aim of these systems is to increase the time, the user spends on the platform and hence having a high commercial value. The systems which aim at increasing the average time a user spends on the platform often need to recommend songs which the user might want to listen to next at each point in time. This is different from recommendation systems which try to predict the item which might be of interest to the user at some point in the user lifetime but not necessarily in the very near future. Prediction of the next song the user might like requires some kind of modeling of the user interests at the given point of time. Attentive neural networks have been exploiting the sequence in which the items were selected by the user to model the implicit short-term interests of the user for the task of next item prediction, however we feel that the features of the songs occurring in the sequence could also convey some important information about the short-term user interest which only the items cannot. In this direction, we propose a novel attentive neural architecture which in addition to the sequence of items selected by the user, uses the features of these items to better learn the user short-term preferences and recommend the next song to the user.Comment: Accepted as a paper at the 12th ACM Conference on Recommender Systems (RecSys 18

    Interfaces, Confinement, And Resonant Raman Scattering In Ge/si Quantum Wells

    Get PDF
    We address the question of confinement of the Ge-Ge mode in five-monolayer-Ge single and multiple quantum wells. Using Raman scattering, our data show strong dependence of the interface quality on the number of quantum wells and thereby on the confinement of both the phonons and the electronic states in the Ge wells. The dependence of line shape and peak position of the Ge-Ge Raman line with laser photon energy gives a clear indication of the existence of terraces in the interfaces of the Ge/Si multiple quantum wells. © 1995 The American Physical Society.5124178001780

    Empowerment for Continuous Agent-Environment Systems

    Full text link
    This paper develops generalizations of empowerment to continuous states. Empowerment is a recently introduced information-theoretic quantity motivated by hypotheses about the efficiency of the sensorimotor loop in biological organisms, but also from considerations stemming from curiosity-driven learning. Empowemerment measures, for agent-environment systems with stochastic transitions, how much influence an agent has on its environment, but only that influence that can be sensed by the agent sensors. It is an information-theoretic generalization of joint controllability (influence on environment) and observability (measurement by sensors) of the environment by the agent, both controllability and observability being usually defined in control theory as the dimensionality of the control/observation spaces. Earlier work has shown that empowerment has various interesting and relevant properties, e.g., it allows us to identify salient states using only the dynamics, and it can act as intrinsic reward without requiring an external reward. However, in this previous work empowerment was limited to the case of small-scale and discrete domains and furthermore state transition probabilities were assumed to be known. The goal of this paper is to extend empowerment to the significantly more important and relevant case of continuous vector-valued state spaces and initially unknown state transition probabilities. The continuous state space is addressed by Monte-Carlo approximation; the unknown transitions are addressed by model learning and prediction for which we apply Gaussian processes regression with iterated forecasting. In a number of well-known continuous control tasks we examine the dynamics induced by empowerment and include an application to exploration and online model learning
    corecore